New substitutions on NS1 protein from influenza A (H1N1) virus: Bioinformatics analyses of Indian strains isolated from 2009 to 2020

Abstract
Background and Aims: Nonstructural protein 1 (NS1) is mainly involved in the virulence and replication of several viruses, including influenza A (H1N1) virus; surveillance of the latter started in India in 2009. The objective of this study was to identify new substitutions in the NS1 protein from influenza A (H1N1) pandemic 2009 (pdm09) strains isolated in India.
Methods: The sequences of NS1 proteins from influenza A(H1N1) pdm09 strains isolated in India were obtained from publicly available databases. Multiple sequence alignment and phylogenetic analyses were performed to confirm the “consistent substitutions” in the NS1 protein from H1N1 (pdm09) Indian strains. Here, “consistent substitutions” were defined as substitutions observed in all the sequences isolated in a given year. Comparative analyses were performed among NS1 Indian sequences from A(H1N1) pdm09, A(H1N1) seasonal, and A(H3N2) strains, and against A(H1N1) pdm09 global strains.
Results: Eight substitutions were identified in the NS1 Indian sequences from the A(H1N1) pdm09 strain: two in the RNA binding domain (RBD), five in the effector domain (ED), and one in the linker region. Three new substitutions are reported in this study at NS1 sequence positions 2, 80, and 155; these evolved between 2015 and 2019 and became “consistent.” These new substitutions were associated with conservative paired substitutions in the alternative domains of the NS1 protein. The three paired substitutions were (i) D2E and E125D, (ii) T80A and A155T, and (iii) E55K and K131E.
Conclusions: This study indicates the continuous evolution of the NS1 protein from the influenza A virus. The new substitutions at positions 2 and 80 occurred in the RNA binding and eIF4GI binding domains. The D2E substitution evolved simultaneously with the E125D substitution, which is involved in viral replication. The third new substitution, at position 155, occurred in the PI3K binding domain. The possible consequences of these substitutions for host–pathogen interactions are subject to further experimental and computational verification.

The A(H1N1) pdm09 strain has shown increased virulence owing to its transmissibility from swine to human hosts. 7,8 The segmented genome of the influenza A virus facilitates genetic reassortment, with the possible emergence of new subtypes. 9 New subtypes often evade the host immune system more efficiently, leading to increased viral spread and pandemic outbreaks. India was more severe than in 2009. 10 Considering the recent epidemic of the influenza virus in India, it is essential to understand the possible genetic drift and shift of the virus to anticipate possible future pandemics.
The protein of choice in this study was the nonstructural protein NS1, as it evades the host immune system and blocks different host signaling pathways. NS1 protein modulates host antiviral responses and promotes viral replication. Hence, it plays a pivotal role in viral pathogenesis. 11 NS1 is a multifunctional protein 12 involved in posttranslational regulation of the influenza virus life cycle. It binds virion RNA, [12][13][14][15] poly(A)-containing RNA, 15 human U6 snRNA, 16 and many more viral and cellular proteins. NS1 protein has two major domains, the N-terminal RNA binding domain (RBD) (residues 1-73) and the C-terminal effector domain (ED) (residues 84 to the end). [17][18][19] The RBD protects the virus against the antiviral state induced by interferon alpha/beta by blocking the activation of the 2′−5′ oligo (A) synthetases/RNase L pathway. 11,12,20 Several host factors (RNA/protein) interact with the NS1 RBD and ED. Some of these host factors are protein kinase R (PKR), 11,12,21 retinoic-acid inducible gene I, 12

| Construction of phylogenetic trees and evolution of amino acid substitutions
MEGAX software 29,30 was used to construct the phylogenetic trees for influenza A virus NS1 proteins from Indian isolates. As a prerequisite to phylogenetic tree construction, multiple sequence alignment was performed using CLUSTALW. 31 Following pairwise alignment parame- matrix-based model. 32 ML is a statistical method to infer probability distribution and assign probabilities to predicted phylogenetic trees.
The JTT substitution model was used to assess the likelihood of particular substitutions. No additional options were used to handle gaps and missing data (i.e., all sites were used). Initial heuristic search tree(s) were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model. 33 Five hundred bootstrap replicates were performed to construct each phylogenetic tree. The final topology was selected based on the superior log-likelihood value. The phylogenetic trees were visualized using MEGAX. (Table 1). Moreover, the "intermediate" group perpetually partitioned the pdm09 strains before and after 2010.
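The bootstrap step described above can be illustrated with a minimal Python sketch. It is not the MEGAX implementation; it only shows the standard idea of resampling alignment columns with replacement to build 500 pseudo-replicate alignments, each of which would then be passed to a tree-building routine. The function names, the toy alignment, and the fixed random seed are assumptions made for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    alignment: list of equal-length aligned sequences (strings).
    Returns a new alignment with the same number of rows and columns.
    """
    n_cols = len(alignment[0])
    if any(len(seq) != n_cols for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Draw n_cols column indices with replacement.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates=500, seed=0):
    """Build n_replicates pseudo-replicate alignments (seed is illustrative)."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment (hypothetical, not data from the study).
toy_alignment = ["MDPN", "MEPN", "MDPS"]
replicates = bootstrap_replicates(toy_alignment, n_replicates=500, seed=1)
print(len(replicates), replicates[0])
```

The support for a clade is then the fraction of replicate trees in which that clade appears; the tree inference itself is left to dedicated software.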

| Recent substitutions deduced from multiple sequence analyses, phylogeny, and evolutionary rate
We identified 33 sequence positions that varied between seasonal and pandemic strains (Table S1). These changes (e.g., XIY) were denoted as substitution (Y) of the consensus amino acid (X) at sequence position I. For example, in the "intermediate" group, the consensus amino acid, glutamic acid (E), was substituted by glutamine (Q) at sequence position 55 (E55Q). One more frequently used term, "consistent substitution," is defined here as a substitution observed in all the sequences isolated in a year. For example, the N205S substitution was consistent from 2013 onwards (Table 2). This substitution was also reported earlier in strains from India. 49 The L90I substitution was first observed in 2011 (in 10 out of 23 sequences) and became consistent in 2014; this substitution was also reported earlier from India. 49 Three conservative paired substitutions, namely (i) E55K and K131E, (ii) D2E and E125D, and (iii) T80A and A155T, were   Table 2). The D2E substitution in the NS1 protein from A(H1N1) pdm09 was reported earlier only once, from Russia. 52 To further confirm our observation, we analyzed the global sequences isolated from 2009 to 2020. The analysis showed that these three unique substitutions in NS1 protein (Figure 2A) identified from India were simultaneously detected in global NS1 sequences (Figure 2B).
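The XIY notation and the definition of a "consistent substitution" given above can be expressed as a short Python sketch. This is an illustration of the definition only, not the pipeline used in the study; the function names and the toy records are hypothetical, and sequences are assumed to be already aligned so that position I is the I-th character (1-based).

```python
import re
from collections import defaultdict

# One-letter consensus residue, 1-based position, one-letter substitute.
SUB_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'E55Q' into (consensus, position, substitute)."""
    match = SUB_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    consensus, position, substitute = match.groups()
    return consensus, int(position), substitute


def consistent_years(records, label):
    """Return the sorted years in which every sequence carries the substitution.

    records: iterable of (year, aligned_sequence) pairs.
    """
    _, position, substitute = parse_substitution(label)
    hits_by_year = defaultdict(list)
    for year, sequence in records:
        hits_by_year[year].append(sequence[position - 1] == substitute)
    return sorted(year for year, hits in hits_by_year.items() if all(hits))


# Toy records (hypothetical, not data from the study).
toy_records = [
    (2015, "MDP"), (2015, "MEP"),  # D2E present in one of two sequences
    (2016, "MEP"), (2016, "MEP"),  # D2E present in all sequences
]
print(parse_substitution("E55Q"))
print(consistent_years(toy_records, "D2E"))
```

With these toy records, D2E counts as consistent only in 2016, because one of the 2015 sequences still carries the consensus residue.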
Next, we asked whether these three new substitutions were unique to A(H1N1) strains. Hence, we analyzed the NS1 sequences available from the A(H3N2) strains circulating in India.
The average pairwise sequence similarity computed across the strains of influenza viruses circulating in India was 87.5% (Table 3).
The similarities among the strains were higher than those across  (Table S2). These results suggested that the three new substitutions in NS1 protein were unique to A(H1N1) pdm09 strains.
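An average pairwise value of the kind reported above can be sketched in Python as follows. The sketch computes percent identity over aligned, non-gap positions, which is a simplifying assumption: the text reports "similarity," and the exact scoring used in the study is not stated here. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def mean_pairwise_identity(sequences):
    """Average percent identity over all unordered pairs of sequences."""
    scores = [percent_identity(a, b) for a, b in combinations(sequences, 2)]
    if not scores:
        raise ValueError("need at least two sequences")
    return sum(scores) / len(scores)


# Toy aligned sequences (hypothetical, not data from the study).
toy_sequences = ["MDPN", "MEPN", "MDPS"]
print(round(mean_pairwise_identity(toy_sequences), 2))
```

A similarity score based on a substitution matrix would replace the equality test with a lookup of residue-pair scores; the averaging over pairs stays the same.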
To corroborate the above substitutions in the NS1 protein from Indian A(H1N1) pdm09 strains, we estimated position-specific evolutionary rates. These rates were scaled so that the average evolutionary rate across all the sites was one. Thus, sites showing a rate less than one evolved slower than the average, and sites showing a rate greater than one evolved faster. , showed an evolutionary rate greater than two.
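The scaling of position-specific rates described above amounts to dividing each raw rate by the mean rate, so that the scaled values average to one. A minimal Python sketch follows; the raw values are invented for illustration and the function name is hypothetical. The estimation of the raw rates themselves is not shown.

```python
def scale_rates(raw_rates):
    """Divide each site rate by the mean so that the scaled mean equals one."""
    mean_rate = sum(raw_rates) / len(raw_rates)
    if mean_rate == 0:
        raise ValueError("mean rate is zero; rates cannot be scaled")
    return [rate / mean_rate for rate in raw_rates]


# Toy raw rates for four sites (hypothetical values).
raw = [0.5, 1.0, 1.5, 5.0]
scaled = scale_rates(raw)

# 1-based positions evolving more than twice as fast as the average site.
fast_sites = [i + 1 for i, rate in enumerate(scaled) if rate > 2]
print(scaled, fast_sites)
```

Sites with a scaled rate above two, as in the threshold mentioned in the text, are then simply those whose raw rate exceeds twice the mean.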

| Host factor (protein/RNA) interactions with influenza A(H1N1) NS1 protein involving the new substitutions identified in this study
As crystal structures are not available for the complete NS1 protein from the H1N1 strain, the interacting residues are depicted on the crystal structure of the NS1 protein from the H6N6 strain. The NS1 protein sequences from these two strains are mostly conserved (Figure 4A). The dsRNA binding residues are buried within the protein, whereas those involved in nuclear localization signal binding are mostly exposed on the protein surface (Figure 4B). Similarly, residues 123-127 from the ED of influenza virus NS1 proteins (those that interact with the PKR domain) are buried within the protein core. protein via classical alpha/beta nuclear import pathway. 19,[45][46][47] The key residues and the interaction patterns with the partner molecules vary in other viral subtypes. In the case of the Zika virus NS1 protein, the membrane- and antibody-binding regions are exposed on the protein surface. The membrane-binding residues are primarily hydrophobic, and the antibody-binding residues are mostly polar. 56 In this study, we identified ribosomal complexes to viral mRNA. 42 Hence, substitutions at these two positions (2 and 80) would presumably alter viral mRNA translation.
One notable observation was that residue position 2 has had a conserved paired substitution with residue position 125 from 2016 onwards (Table 2).
Residue position 125 (along with residue positions 108 and 189) is also involved in viral replication. 48
Conceptualization, formal analysis, investigation, methodology, project administration, supervision, validation, writing-review & editing.

CONFLICTS OF INTEREST
The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data analyzed here are publicly available from the Influenza Virus Database at NCBI (https://www.ncbi.nlm.nih.gov) and from the GISAID Initiative.

TRANSPARENCY STATEMENT
This manuscript is an honest, accurate, and transparent account of the study being reported; no important aspects of the study have been omitted.